Smart Paper Notes

Deep Bayesian Active-Learning-to-Rank for Endoscopic Image Data

Takeaki Kadota , Hideaki Hayashi , Ryoma Bise , Kiyohito Tanaka , Seiichi Uchida

Category: Computer Vision

2022-08-05

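Automatic image-based disease severity estimation generally uses discrete (i.e., quantized) severity labels. Annotating discrete labels is often difficult because the severity shown in an image can be ambiguous. An easier alternative is relative annotation, which compares the severity level between a pair of images. By using a learning-to-rank framework with relative annotation, we can train a neural network that estimates rank scores related to severity levels. However, relative annotation of all possible pairs is prohibitively expensive, so appropriate selection of sample pairs is mandatory. This paper proposes deep Bayesian active-learning-to-rank, which trains a Bayesian convolutional neural network while automatically selecting appropriate pairs for relative annotation. We confirmed the efficiency of the proposed method through experiments on endoscopic images of ulcerative colitis. In addition, we confirmed that our method is useful even under severe class imbalance because it automatically selects samples from minor classes.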
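The two ingredients named in the abstract, learning to rank from relative annotations and choosing which pairs to annotate, can be sketched as follows. This is a minimal illustration under assumptions that the abstract does not state: a RankNet-style pairwise loss, and pair uncertainty measured as the variance of the predicted ordering across stochastic forward passes (for example, Monte Carlo dropout). The function names are hypothetical.

```python
import itertools
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pairwise_rank_loss(score_a, score_b, a_is_more_severe):
    """Cross-entropy on P(a more severe than b) = sigmoid(score_a - score_b).

    Assumed RankNet-style loss; the abstract does not specify the loss.
    """
    p = sigmoid(score_a - score_b)
    target = 1.0 if a_is_more_severe else 0.0
    eps = 1e-12  # avoids log(0)
    return -(target * np.log(p + eps) + (1.0 - target) * np.log(1.0 - p + eps))

def select_uncertain_pairs(score_samples, num_pairs):
    """Pick the pairs whose predicted ordering varies most across passes.

    score_samples: array of shape (T, N) holding rank scores for N images
    from T stochastic forward passes (hypothetical uncertainty estimate).
    """
    n_images = score_samples.shape[1]
    scored = []
    for i, j in itertools.combinations(range(n_images), 2):
        p = sigmoid(score_samples[:, i] - score_samples[:, j])
        scored.append((float(np.var(p)), (i, j)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:num_pairs]]
```

Pairs whose ordering the model already predicts consistently receive near-zero variance and are skipped, so annotation effort goes to ambiguous pairs.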


Order-Guided Disentangled Representation Learning for Ulcerative Colitis Classification with Limited Labels

Shota Harada , Ryoma Bise , Hideaki Hayashi , Kiyohito Tanaka , Seiichi Uchida

Category: Computer Vision

2021-11-06

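Ulcerative colitis (UC) classification, an important task in endoscopic diagnosis, involves two main difficulties. First, endoscopic images annotated for UC (positive or negative) are usually limited. Second, the images vary widely in appearance depending on their location in the colon. The second difficulty in particular prevents us from using existing semi-supervised learning techniques, which are the common remedy for the first difficulty. In this paper, we propose a practical semi-supervised learning method for UC classification that newly exploits two additional features, the location in the colon (e.g., left colon) and the image capturing order, both of which are often attached to individual images in endoscopic image sequences. With the help of these features, the proposed method can efficiently extract the information essential for UC classification. Experimental results demonstrate that the proposed method outperforms several existing semi-supervised learning methods in the classification task, even with a small number of annotated images.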


Gaussian Process Classification Bandits

Tatsuya Hayashi , Naoki Ito , Koji Tabata , Atsuyoshi Nakamura , Katsumasa Fujita , Yoshinori Harada , Tamiki Komatsuzaki

Category: Machine Learning

2022-12-26

Classification bandits are multi-armed bandit problems whose task is to classify a given set of arms into either positive or negative class depending on whether the rate of the arms with the expected reward of at least h is not less than w for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space with expected rewards f(x) which are generated according to a Gaussian process prior. We develop a framework algorithm for the problem using various arm selection policies and propose policies called FCB and FTSV. We show a smaller sample complexity upper bound for FCB than that for the existing algorithm of the level set estimation, in which whether f(x) is at least h or not must be decided for every arm's x. Arm selection policies depending on an estimated rate of arms with rewards of at least h are also proposed and shown to improve empirical sample complexity. According to our experimental results, the rate-estimation versions of FCB and FTSV, together with that of the popular active learning policy that selects the point with the maximum variance, outperform other policies for synthetic functions, and the version of FTSV is also the best performer for our real-world dataset.
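The classification rule described at the start of this abstract can be written down directly. The sketch below evaluates the rule on known expected rewards, which is only the ground-truth target: in the bandit setting these rewards are unknown and must be inferred from noisy samples. The function name is hypothetical.

```python
import numpy as np

def classify_arm_set(expected_rewards, h, w):
    """Return ("positive" | "negative", rate) for a set of arms.

    The set is positive when the fraction of arms whose expected reward
    is at least h is not less than w.
    """
    rewards = np.asarray(expected_rewards, dtype=float)
    rate = float(np.mean(rewards >= h))
    label = "positive" if rate >= w else "negative"
    return label, rate
```

Unlike level set estimation, the rule only needs the rate of arms above h, not a decision for every individual arm, which is why a smaller sample complexity is plausible.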


Fully 3D Implementation of the End-to-end Deep Image Prior-based PET Image Reconstruction Using Block Iterative Algorithm

Fumio Hashimoto , Yuya Onishi , Kibo Ote , Hideaki Tashima , Taiga Yamaya

Category: Computer Vision | Machine Learning

2022-12-22

Deep image prior (DIP) has recently attracted attention for unsupervised positron emission tomography (PET) image reconstruction, which does not require any prior training dataset. In this paper, we present the first attempt to implement an end-to-end DIP-based fully 3D PET image reconstruction method that incorporates a forward-projection model into the loss function. To make fully 3D PET image reconstruction practical, which was previously infeasible because of graphics processing unit memory limitations, we modify the DIP optimization into a block-iterative scheme that sequentially learns an ordered sequence of block sinograms. Furthermore, a relative difference penalty (RDP) term is added to the loss function to improve the quantitative accuracy of the PET image. We evaluated the proposed method using a Monte Carlo simulation with [$^{18}$F]FDG PET data of a human brain and a preclinical study on monkey brain [$^{18}$F]FDG PET data. The proposed method was compared with maximum-likelihood expectation maximization (EM), maximum a posteriori EM with RDP, and hybrid DIP-based PET reconstruction methods. The simulation results showed that the proposed method improved PET image quality by reducing statistical noise while preserving the contrast of brain structures and the inserted tumor better than the other algorithms. In the preclinical experiment, the proposed method recovered finer structures and better contrast. These results indicate that the proposed method can produce high-quality images without a prior training dataset. Thus, the proposed method is a key enabling technology for the straightforward and practical implementation of end-to-end DIP-based fully 3D PET image reconstruction.
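For readers unfamiliar with the relative difference penalty mentioned above, the sketch below computes its commonly used form, the sum over neighboring voxel pairs of (x_j - x_k)^2 / (x_j + x_k + gamma * |x_j - x_k|). Several choices here are assumptions rather than details from the abstract: unit neighbor weights, nearest neighbors along each axis only, each pair counted once, gamma = 2 as the default, and a small eps for numerical safety.

```python
import numpy as np

def relative_difference_penalty(image, gamma=2.0, eps=1e-8):
    """Sum the RDP over nearest-neighbor pairs along every axis.

    Assumes a non-negative image (as in PET). gamma and eps are
    illustrative defaults, not values from the paper.
    """
    x = np.asarray(image, dtype=float)
    total = 0.0
    for axis in range(x.ndim):
        n = x.shape[axis]
        a = np.take(x, np.arange(0, n - 1), axis=axis)  # voxel j
        b = np.take(x, np.arange(1, n), axis=axis)      # its neighbor k
        diff = a - b
        total += np.sum(diff ** 2 / (a + b + gamma * np.abs(diff) + eps))
    return float(total)
```

Because the denominator grows with the local intensity, the same absolute difference is penalized less in bright regions than in dark ones, which is the property that distinguishes this penalty from a plain quadratic smoothness term.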


Local Differential Privacy Image Generation Using Flow-based Deep Generative Models

Hisaichi Shibata , Shouhei Hanaoka , Yang Cao , Masatoshi Yoshikawa , Tomomi Takenaga , Yukihiro Nomura , Naoto Hayashi , Osamu Abe

Category: Computer Vision

2022-12-20

Diagnostic radiologists need artificial intelligence (AI) for medical imaging, but access to the medical images required for training AI has become increasingly restrictive. To release and use medical images, we need an algorithm that can simultaneously protect privacy and preserve pathologies in medical images. To develop such an algorithm, we propose DP-GLOW, a hybrid of a local differential privacy (LDP) algorithm and one of the flow-based deep generative models (GLOW). By applying a GLOW model, we disentangle the pixelwise correlation of images, a correlation that makes it difficult to protect privacy with straightforward LDP algorithms for images. Specifically, we map images onto the latent vector of the GLOW model, each element of which follows an independent normal distribution, and we apply the Laplace mechanism to the latent vector. Moreover, we applied DP-GLOW to chest X-ray images to generate LDP images while preserving pathologies.
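The step of applying the Laplace mechanism to a latent vector can be sketched as below. The GLOW encoder and decoder are omitted; `latent` stands for an already encoded image. The clipping bound and the resulting L1 sensitivity are assumptions made here so that the sensitivity is finite, and they are not taken from the abstract.

```python
import numpy as np

def laplace_mechanism(latent, epsilon, clip=3.0, rng=None):
    """Add Laplace noise with scale = sensitivity / epsilon to a latent vector.

    Each element is first clipped to [-clip, clip] (an assumed bound), so
    any two clipped vectors differ by at most 2 * clip * d in L1 norm.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = np.clip(np.asarray(latent, dtype=float), -clip, clip)
    sensitivity = 2.0 * clip * z.size
    scale = sensitivity / epsilon
    return z + rng.laplace(loc=0.0, scale=scale, size=z.shape)
```

A smaller epsilon gives stronger privacy and larger noise. In a full pipeline the noisy latent vector would then be passed through the inverse flow to obtain the released image.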


Slimmable Pruned Neural Networks

Hideaki Kuratsu , Atsuyoshi Nakamura

Category: Computer Vision

2022-12-07

Slimmable Neural Networks (S-Net) is a network that can dynamically select one of several predefined proportions of channels (a sub-network) depending on the computational resources currently available. The accuracy of each sub-network of S-Net, however, is inferior to that of an individually trained network of the same size because simultaneously optimizing different sub-networks is difficult. In this paper, we propose Slimmable Pruned Neural Networks (SP-Net), whose sub-network structures are learned by pruning instead of adopting the same proportion of channels in each layer (width multiplier) as S-Net does. We also propose a new pruning procedure, multi-base pruning, in place of one-shot or iterative pruning to achieve high accuracy and large savings in training time. We also introduce slimmable channel sorting (scs) to achieve computation as fast as S-Net, and zero padding match (zpm) pruning to prune residual structures more efficiently. SP-Net can be combined with any kind of channel pruning method and does not require complicated processing or time-consuming architecture search like NAS models. Compared with each sub-network of the same FLOPs on S-Net, SP-Net improves accuracy by 1.2-1.5% for ResNet-50, 0.9-4.4% for VGGNet, 1.3-2.7% for MobileNetV1, and 1.4-3.1% for MobileNetV2 on ImageNet. Furthermore, our methods outperform other SOTA pruning methods and are on par with various NAS models according to our experimental results on ImageNet. The code is available at https://github.com/hideakikuratsu/SP-Net.
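The difference between a uniform width multiplier and per-layer channel counts can be illustrated with a toy fully connected network in which every sub-network uses the leading channels of shared weight matrices. This is only a sketch of the general idea of running sub-networks from shared weights; the prefix-slicing layout, the function name, and the toy layers are assumptions and do not reproduce the SP-Net code.

```python
import numpy as np

def slim_forward(x, weights, widths):
    """Run a sub-network that keeps the first widths[l] output channels of layer l.

    weights: list of full (out_channels, in_channels) matrices shared by
    all sub-networks. widths: number of leading output channels per layer.
    """
    h = np.asarray(x, dtype=float)
    for w, out_ch in zip(weights, widths):
        w_sub = w[:out_ch, :h.shape[0]]  # slice to the active channels
        h = np.maximum(w_sub @ h, 0.0)   # linear layer followed by ReLU
    return h
```

With a uniform multiplier, `widths` is the same fraction of every layer; with pruning-derived structures, each layer can keep a different number of channels, for example `widths=[2, 2]` on layers of size 4 and 2.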


Aging prediction using deep generative model toward the development of preventive medicine

Hisaichi Shibata , Shouhei Hanaoka , Yukihiro Nomura , Naoto Hayashi , Osamu Abe

Category: Computer Vision

2022-08-23

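From birth to death, we all experience surprisingly ubiquitous changes due to aging. If we could predict aging in the digital domain, that is, with a digital twin of the human body, we would be able to detect lesions at a very early stage, thereby improving quality of life and extending life span. We observed that none of the previously developed digital twins of the adult human body explicitly trained longitudinal conversion rules between volumetric medical images with deep generative models, which potentially results in poor prediction performance for, for example, ventricular volumes. Here, we establish a new digital twin of the adult human body that is trained on longitudinally acquired head computed tomography (CT) images and predicts future volumetric head CT images from a single present volumetric head CT image. For the first time, we adopt one of the three-dimensional flow-based deep generative models to realize this sequential three-dimensional digital twin. We show that our digital twin outperforms the latest methods for predicting ventricular volumes over relatively short terms.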


Critical Bach Size Minimizes Stochastic First-Order Oracle Complexity of Deep Learning Optimizer using Hyperparameters Close to One

Hideaki Iiduka

Category: Machine Learning

2022-08-21

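Practical results have shown that deep learning optimizers using small constant learning rates, hyperparameters close to one, and large batch sizes can find model parameters of deep neural networks that minimize the loss functions. We first show theoretical evidence that the momentum method (Momentum) and adaptive moment estimation (Adam) perform well, in the sense that the upper bound of the theoretical performance measure is small with a small constant learning rate, hyperparameters close to one, and a large batch size. Next, we show that there exists a batch size, called the critical batch size, that minimizes the stochastic first-order oracle (SFO) complexity, which is the stochastic gradient computation cost, and that the SFO complexity increases once the batch size exceeds the critical batch size. Finally, we provide numerical results that support our theoretical results. That is, the numerical results indicate that Adam using a small constant learning rate, hyperparameters close to one, and the critical batch size minimizing SFO complexity converges faster than Momentum and stochastic gradient descent (SGD).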
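To make the notion of a critical batch size concrete, the sketch below uses a purely illustrative model that is not taken from the abstract: the number of steps needed to reach precision eps with batch size b is assumed to be K(b) = c1 * b / (eps^2 * b - c2), and the SFO complexity is the total number of stochastic gradient evaluations, N(b) = K(b) * b. The constants c1 and c2 and all function names are hypothetical.

```python
def steps_needed(b, c1, c2, eps):
    """Hypothetical step count K(b); decreases as the batch size b grows."""
    denom = eps ** 2 * b - c2
    if denom <= 0:
        raise ValueError("batch size too small to reach the target precision")
    return c1 * b / denom

def sfo_complexity(b, c1, c2, eps):
    """Total stochastic gradient evaluations: steps times batch size."""
    return steps_needed(b, c1, c2, eps) * b

def critical_batch_size(candidates, c1, c2, eps):
    """Batch size among the candidates that minimizes the SFO complexity."""
    return min(candidates, key=lambda b: sfo_complexity(b, c1, c2, eps))
```

Under this assumed model, setting the derivative of N(b) to zero gives b = 2 * c2 / eps^2, so larger batches reduce the number of steps but raise the total gradient cost once b passes that value.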


A Comparative Study of Self-supervised Speech Representation Based Voice Conversion

Wen-Chin Huang , Shu-Wen Yang , Tomoki Hayashi , Tomoki Toda

Category: Machine Learning

2022-07-10

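We present a large-scale comparative study of self-supervised speech representation (S3R)-based voice conversion (VC). In the context of recognition-synthesis VC, S3Rs are attractive because of their potential to replace expensive supervised representations such as phonetic posteriorgrams (PPGs), which are adopted by state-of-the-art VC systems. Using S3PRL-VC, an open-source VC software package we previously developed, we provide a series of in-depth objective and subjective analyses under three VC settings: intra-/cross-lingual any-to-one (A2O) and any-to-any (A2A) VC, using the Voice Conversion Challenge 2020 (VCC2020) dataset. We investigate S3R-based VC from various aspects, including model type, multilinguality, and supervision. We also study the effect of a discretization process based on k-means clustering and show how it improves performance in the A2A setting. Finally, a comparison with state-of-the-art VC systems demonstrates the competitiveness of S3R-based VC and sheds light on possible directions for improvement.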


Theoretical analysis of Adam using hyperparameters close to one without Lipschitz smoothness

Hideaki Iiduka

Category: Machine Learning

2022-06-27

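Convergence and convergence rate analyses of adaptive methods, such as adaptive moment estimation (Adam) and its variants, have been widely studied for nonconvex optimization. These analyses are based on the assumptions that the expected or empirical average loss function is Lipschitz smooth (i.e., its gradient is Lipschitz continuous) and that the learning rates depend on the Lipschitz constant of the Lipschitz continuous gradient. Meanwhile, numerical evaluations of Adam and its variants have shown that using small constant learning rates that do not depend on the Lipschitz constant, together with hyperparameters ($\beta_1$ and $\beta_2$) close to one, is advantageous for training deep neural networks. Since computing the Lipschitz constant is NP-hard, the Lipschitz smoothness condition is unrealistic. This paper provides theoretical analyses of Adam without assuming the Lipschitz smoothness condition, in order to bridge the gap between theory and practice. The main contribution is theoretical evidence that Adam performs well with small learning rates and hyperparameters close to one, whereas the previous theoretical results were all for hyperparameters close to zero. Our analysis also leads to the finding that Adam performs well with large batch sizes. Moreover, we show that Adam performs well when it uses diminishing learning rates and hyperparameters close to one.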
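For reference, the setting analyzed here (a small constant learning rate with $\beta_1$ and $\beta_2$ close to one) corresponds to the standard Adam update, sketched below for a scalar parameter. The defaults `lr=0.01`, `beta1=0.9`, `beta2=0.999`, and the quadratic test problem in the usage note are illustrative assumptions, not settings taken from the paper.

```python
import math

def adam_minimize(grad_fn, x0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Standard Adam with bias correction and a constant learning rate."""
    x, m, v = float(x0), 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1.0 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)         # bias-corrected moments
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x
```

For example, `adam_minimize(lambda x: 2 * x, 1.0, steps=2000)` minimizes f(x) = x^2 starting from x = 1. Note that the learning rate is a fixed constant that is chosen without reference to any Lipschitz constant of the gradient.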
